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We propose a method to search for signs of causal structure in spatiotemporal data making mini- 
mal a priori assumptions about the underlying dynamics. To this end, we generalize the elementary 
concept of recurrence for a point process in time to recurrent events in space and time. An event is 
defined to be a recurrence of any previous event if it is closer to it in space than all the intervening 
events. As such, each sequence of recurrences for a given event is a record breaking process. This 
definition provides a strictly data driven technique to search for structure. Defining events to be 
nodes, and linking each event to its recurrences, generates a network of recurrent events. Significant 
deviations in statistical properties of that network compared to networks arising from (acausal) 
random processes allows one to infer attributes of the causal dynamics that generate observable cor- 
relations in the patterns. We derive analytically a number of properties for the network of recurrent 
events composed by a random process in space and time. We extend the theory of records to treat 
not only the variable where records happen, but also time as continuous. In this way, we construct 
a fully symmetric theory of records leading to a number of new results. Those analytic results are 
compared in detail to the properties of a network synthesized from time series of epicenter locations 
for earthquakes in Southern California. Significant disparities from the ensemble of acausal networks 
that can be plausibly attributed to the causal structure of seismicity are: (1) Invariance of network 
statistics with the time span of the events considered, (2) Appearance of a fundamental length scale 
for recurrences, independent of the time span of the catalog, which is consistent with observations 
of the "rupture length" , (3) Hierarchy in the distances and times of subsequent recurrences. As ex- 
pected, almost all of the statistical properties of a network constructed from a surrogate in which the 
original magnitudes and locations of earthquake epicenters are randomly "shuffled" are completely 
consistent with predictions from the acausal null model. 

PACS numbers: 02.50.-r,05.65.+b,91.30.Dk,05.45.Tp 



I. INTRODUCTION 



Many striking features of physical, biological or so- 
cial processes can be portrayed as patterns or clus- 
ters of localized events. These can be flips of mag- 
netic domains in a ferromagnet leading to Barkhausen 
noise P, traffic jams 3[, booms and busts of mar- 
kets and economies H|, forest fires [f|, the spread 
of infections 01 and global pandemics, extinctions of 
species 0,0, ©M, neural spikes [HI, solar flares [HG3, 
or earthquakes [l4, fl5l [l6| - to name a few. A generic 
attribute in all these cases is that one event can trigger or 
somehow induce another one to occur - or possibly nu- 
merous further events. Sometimes, as in the prototype 
sandpile model [17} . an accounting of causes and their 
effects leads to an interpretation in terms of avalanches - 
where causal connections between clustered events ( "top- 
plings") are explicitly rationalized by the microscopic 
state and rules of the dynamical system. More often than 
not, though, the network of causal connections cannot be 
resolved from the data at hand and remains ambiguous. 
Thus, one is often confronted with inferring a plausible 
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causal structure from clusters of localized events without 
a detailed or "fundamental" knowledge of the true micro- 
scopic dynamics. This remains a stubbornly impenetra- 
ble problem despite some progress in special cases (see 
e.g. Ref. [lH and references therein). 

We aim to establish a general procedure of plausible 
inference based on sequences of data in space and time, 
or more generally for any temporal sequence of data. The 
essential idea for the method of analysis discussed here 
is that of a recurrence. Our definition of recurrences is a 
generalization of "returns" for a point process to higher 
dimensional data structures that evolve in time. Loosely 
spoken, a recurrence involves a pair of events which are 
sufficiently close to each other to suggest a causal con- 
nection. 



A. An example of contextual dependence 

For illustration consider the two events: (A) First, Al- 
ice drops a banana, and (B) then Bob falls down. If A 
and B are sufficiently close in space and time then one 
can reasonably infer that it is likely that Bob slipped on 
the banana and fell down ("A caused B"), but should 
these events be sufficiently separated then A is less likely 
to have contributed to B's occurrence. For instance, Bob 
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could have been distracted by the banana, or fell for an- 
other reason related to A without actually slipping di- 
rectly on the banana - so the two events may still be 
connected without A being exclusively the cause of B. 
This secondary effect is also less likely if sufficient time 
has past between the two events. Eventually Alice or an- 
other party may pick up the banana or Bob's fall may 
have happened so far away that it would be unlikely for 
him to have slipped on it. 

As this example shows, it is not always clear what we 
should mean by 'sufficiently close' to infer a causal con- 
nection. One option might be to call a localized event B 
a recurrence of an earlier event A, if its spatial distance 
is less than some chosen length I [ijj. In addition to in- 
troducing a length scale, this choice fails to admit that 
the plausibility of causal connections typically becomes 
weaker with time - as the example above makes plain. In 
addition, the likelihood that the later event (B) may be 
triggered by a third intervening event increases with time 
as well. These considerations might suggest that I should 
shrink with time. On the other hand, the fact that influ- 
ences usually spread either diffusively or with finite speed 
could suggest the opposite - that I increases with time. 
Spreading of influence is hypothesized, for instance, in 
theories of "aftershock zone diffusion" (see Ref. [2(| and 
references therein). Other, more complicated scenarios 
are also conceivable. 

This discussion is meant to clarify that without suf- 
ficiently accurate a priori knowledge of the underlying 
microscopic dynamics any definition of closeness based 
on predefined scales is arbitrary and might significantly 
alter the inferred causal structure. To avoid this prob- 
lem, or more generally to minimize the influence of the 
observer, we take the view that, to begin with, a suitable 
definition of closeness ought to be purely contextual, and 
depend only on the actual history of events. Taking this 
as our starting point - that we know the observed his- 
tory of events but do not know the underlying dynamics - 
we propose a contextual method to establish recurrences 
that uses 'zero knowledge' of the underlying physical pro- 
cesses. As a result, our definition is generic and can ap- 
ply to a wide variety of situations. This approach serves 
as a starting point to analyze data for systems where 
the underlying dynamics is obscure, mysterious or even 
misconceived. It comprises a fundamental extension of 
the concept of recurrences for a point process to recur- 
rent events in space and time that allows the inference of 
causal relations from available or possible observations. 



B. Contextual relationships represented by a 
network 

In the approach described here, the inferred relation- 
ship between each pair of events is based on the closeness 
of the pair relative to all the other events that have oc- 
curred in the data set. An event B is designated to be 
a recurrence of a previous one A if it is closer to A - 



compared to any other event occurring in the time inter- 
val between A and B. By this construction, each recur- 
rence is a new "record" in the sequence of distances that 
subsequent events have from A. In other words, each 
recurrence is a record breaking event [2lL I22I |23| . 

This method of inferring relationships between pairs of 
events is naturally expressed as a network of connected 
events where each event is a node in the graph, and each 
recurrent pair is linked with a time directed edge. Sig- 
nificant deviations in the statistics of the resulting net- 
work from that for a random process (which lacks any 
causal relations between events) highlights relevant parts 
of the causal dynamical process(es) generating the pat- 
terns. In principle, the events themselves do not have to 
take place in real physical space, but can occur in any 
space as long as it is equipped with a metric that de- 
fines distances. As a starting point, here we only discuss 
spatiotemporal point processes and take as our test bed 
a well-characterized, extensive and comparatively accu- 
rate catalog [24[ of earthquake epicenters for Southern 
California. 



C. Outline 

Section II explains our method for constructing net- 
works of recurrent events and the relation to record 
breaking statistics. In Section III, the null hypothesis 
of independent, random events is introduced and a num- 
ber of analytic results are obtained for it. We extend the 
mathematical theory of record breaking statistics to the 
case where both space (or the variable which fluctuates 
and in which records take place) and time (or the order- 
ing of events) are treated on the same footing. Treating 
both space and time as continuous symmetrizes the the- 
ory - making it more concise. These results allow us to 
discover statistical features in the actual network of re- 
currences that are unlikely in acausal random processes 
and, hence, plausibly due to causal structures in the un- 
derlying dynamics. 

Section IV describes the application to seismicity. The 
network analysis reveals new statistical features of seis- 
micity — with robust scaling laws that are invariant over 
a range of different time scales. This apparent invariance 
with respect to the time span is diametrically opposed 
to the behavior for a random process, where all statis- 
tical distributions depend explicitly on the time span 
over which events are recorded. The rupture length and 
its scaling with magnitude (while being invariant with 
respect to the time span of the history) emerges from 
the data analysis without being predefined by the mea- 
surement process. It is a generic measure for distance 
between recurrent events. These results indicate that 
our method is, indeed, tending to identify causally re- 
lated events rather than acausal pairs. Further, the rel- 
ative separations for subsequent recurrences in space (or 
time) form a hierarchy with unexpected properties. All 
of these properties disappear when a history constructed 
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by "shuffling" the original earthquake catalog is analyzed 
using the same method. In that case, almost all results 
agree with predictions of the acausal null model. On the 
basis of these results, we argue that the particular fea- 
tures where we observe strong deviations between the ac- 
tual history and the acausal null model can be attributed 
to causal structures in the dynamics of seismicity. We 
end with a summary and outlook for future works and 
applications. 



II. SYNTHESIZING THE NETWORK OF 
RECURRENCES 

Consider a series of events a,, with i = 1 . . . N, that 
are ordered in time such that event precedes event 
a,j if i < j. The events aj are in the following identi- 
fied with their spatiotemporal position. We assume that 
a metric is defined in space, and we denote by dy the 
spatial distance between events and a,j. Simple ex- 
amples are spatiotemporal point processes taking place 
in 3-dimensional Euclidean space or on the surface of a 
sphere. The only property of the metric relevant to this 
discussion is that (spatial) distances between all pairs of 
events can be ordered, e.g. from smallest to largest, and 
the ordering relation is transitive. The same is of course 
true for time distances. 

The network of recurrent events is defined as follows 
(see Fig. Q}: All events en are represented as nodes and 
two nodes i and j with i < j are connected by a directed 
link or edge ey if event aj is a recurrence of a,. This 
occurs if and only if dy < dik for all fc with i < k < j . 
Thus a recurrence is a new record with respect to dis- 
tance. Note that ey and eji cannot both exist since the 
directionality of links is determined by the time ordering. 
Hence, if i < j only ey can exist. To summarize: the def- 
inition of recurrence implies that, for all 2 < j < n, event 
a,j is automatically a recurrence of event cy_i and, thus, 
all links eu—Uj exist. Event cy is also a recurrence of 
any previous event a* if it is closer to than every other 
event a.k that occurred in between the two, i.e., for all a,k 
with i < k < j. 

As long as only one event occurs at a time, the di- 
rected network consists of a single cluster in which each 
node is linked to at least one other node. Each node i 
has an in-degree fc™, which is the number of links point- 
ing to it from events in its past, as well as an out-degree 
k° ut , which is the number of links emanating from i - 
corresponding to the number of records of event i. The 
collection of in-nodes I\ = {j | eyj exists} are hypothe- 
sized to reflect the potential cause(s) of event a, while the 
set of out-nodes Oi — {j \ eij exists} are hypothesized 
to contain the effect (s) of <ij. Although it is natural to 
contemplate associating a weight factor to each link, this 
requires further assumptions. Here we do not deal with 
this issue and consider all links to have the same weight. 
This is in our view a "zeroth order" assignment of causes 
and their effects based purely on the history of events and 




FIG. 1: Eight events in 2D space labelled according to their 
order of occurrence in time. The network of recurrences is 
indicated by arrows as described in the text. 



their relationships to each other in space and time. Note 
that a single event can have many causes correspond- 
ing to all of its incoming links, so the network aspect of 
causal relations is not lost in this limit. Weighted net- 
works of seismic events were constructed using a different 
methodology in Refs. [HIIUll- 

While this network construction, based on record 
breaking events, is directly applicable to fixed collections 
of events, it can also be applied when the number of 
events N increases over time. The result of adding a new 
event oln+i is to increase the number of links by at least 
one, namely ejv(iv-i-i)> without altering any pre-existing 
links. Hence, the property of being a recurrence is pre- 
served in all cases under addition of new nodes in time. 
Also the collections of in-nodes for all pre-existing nodes 
remains unchanged. Yet, the out-degree of any node i 
with i < N + 1 can increase by one, namely if a^+i is 
a recurrence of dj. So the networks are , in this sense, 
dynamically stable growing networks [27ll28l|. 

Some tools and measures already exist to quantify sta- 
tistical topological features of networks, and to reveal the 
organization of the dynamical process(es) giving rise to 
the events in terms of network statistics [23, [28j]. The 
dependence of the network statistics can be examined 
by varying the time span of the history synthesized into 
a network, space window over which the history is ob- 
served, and/or selection criteria for what is defined as 
an event (in the seismic application discussed later, this 
could e.g. be the range of earthquake magnitudes). Our 
approach opens up a new view of dynamical organiza- 
tion of spatiotemporal activity in terms of the (static) 
topolo gy of complex networks - as was also discussed 
in [Hill, [H, Ha, HJ. We also believe it possible that 
new developments in network theory may turn out to be 
even more powerful in analyzing dynamical systems. For 
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the work described here, standard methods of network 
analysis are already sufficient to plausibly infer certain 
causal relations in seismic behavior solely from the cat- 
alog of earthquake magnitudes, epicenter locations and 
times. 



III. THE ACAUSAL NULL MODEL AND A 
THEORY OF RECORDS 

A. General remarks 

In order to be able to associate causal characteristics 
of the dynamics to the network of recurrences, we math- 
ematically establish statistical properties of a null model, 
where the events in space and time are random, uncor- 
rected and causally unrelated. Then any statistically 
significant deviation of the observed network from this 
null hypothesis can be attributed to correlations among 
events and to causal structure in the underlying dynam- 
ics giving rise to the observed history. The conclusions 
about the relation to causality are robust as long as the 
relevant properties of any acausal null model are well 
represented by those we study. 

In the following we shall discuss several variants of the 
null model. In all of them, both space and time are con- 
tinuous. To the best of our knowledge, the theory of 
records has up to now been developed only for discrete 
time and continuous space [2l|, [22|, [23[ . As we shall see, 
when both variables are continuous the core of the theory 
becomes symmetric under exchange of space and time, al- 
lowing for a more concise formulation. This symmetry is 
obviously lost when making one of the variables discrete. 

Let us denote by p(xi, t\\ . . . x n , f n ) the joint prob- 
ability density for having events at locations (x,,tj), 
i = 1, . . . n. Our basic assumptions are that: 

(a) Events are independent and identically distributed 
(rid), 

n 

p„(xi,ii; . . .x n ,t n ) = ^Qp^Xj,^). (1) 

i=i 

(b) The single-event distributions factorizc, 

Pl (xi,t) = p x (x)p t (t). (2) 

In particular, when pt(t) = const, Eq. @ means that we 
have a stationary system. Note that J dxdtpi(x,t) = N, 
the total average number of events in the history, as long 
as this number is finite. 

Instead of event distributions themselves, we shall in 
the following use the distributions of space-time distances 
relative to some reference event or "Event-0" at (xo,io), 

n p 

p„(h,ti]...l n ,t n ) =Y\[ dyiS(\xQ-yi\-k)]x (3) 
Pn (xq , to; yi , to + t% ; • • • Yn , to +t n ). 



It is easily seen that these joint distributions also factor- 
ize under the above assumptions as, 

n 

p n (h,ti] . . .l ni t n ) = JjA*i(Z i; ti) (4) 

i=l 

with 

p Jl {l,t)=iii{l)nt(t). (5) 

The functions pi{l) and pt(t) might in general depend on 
the reference point, xo. We will not indicate this depen- 
dence explicitly, unless it is relevant for the calculation. 

First, we consider the special case pi(l) = p t (t) = 1, 
which holds if the system is stationary, 1-dimensional, 
homogeneous, and has the suitable space-time density 
of events. The next step is when either one of these 
functions or both are equal to one up to finite cut-offs and 
zero beyond, i.e. pi(l) — Q(X—l) and/or pt(t) — 0(cr— t). 
Physically, A is not only the maximal possible distance 
between two events (due to finiteness of space), but it 
is also the rate at which events occur per unit time, if 
Pt — Similarly, a finite value of a indicates not only 
that events are observed in a finite time window, but 
also that the average number of events per unit distance 
is finite. 



B. Canonical coordinates 

Fortunately, it is sufficient to discuss these simple 
cases, because for any non-singular densities pi{l) and 
Pt(t) the problem can be reduced to one of them by a 
change of coordinates. Consider the two transformations 

£= / dl'pS') , t = [ dt'p t {t') . (6) 
Jo Jo 

Clearly, £ is a positive and monotonically increasing func- 
tion of /, while r is a positive and monotonically increas- 
ing function of t . Due to conservation of probability, both 
have unit density 

^(0 = e(A-o, nr(T) = e(a-T), (7) 

where we have denoted by A and a the integrals over pi 
and p t , respectively, 

/>oo />OC 

A= / dl'mil 1 ) , a= \ dt'p t {t'). (8) 
Jo Jo 

Thus, the distributions of events in £ and r are cut-off 
sharply at A and a, respectively. Note that A and a can 
be infinite. 

Thus, for general space and time distributions, we can 
first do all calculations in the "canonical coordinates" £ 
and t, and then translate the results, using inverse trans- 
formations of Eq. ([6]), back to the original coordinates 
l,t. Examples are given below. In the following we al- 
ways assume that £ and r are defined by Eq. ^ and, 
thus, Eq. holds for all positive £ and t. 
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FIG. 2: A typical chain of recurrences in canonical coordi- 
nates. The reference event or Event-0 for all these recurrences 
is at the origin r = £ = 0. The event at Ti) is a recurrence 
of the event at (0,0) if and only if no event is in the shaded 
region. 



In canonical coordinates, a typical sequence of recur- 
rences is drawn schematically in Fig. [2j For all recur- 
rences i, < £j and 73+1 > Tj. This is symmetric 
under the exchange £ «-> r, A <-> cr , and i <-> — i. The 
probability that a given event (£, r) is a recurrence of 
Event-0 at (0,0) is equal to the chance that no event oc- 
curred in the rectangular region [0,t] x [0, £], which is 
equal to exp(— £t) due to the unit space-time density of 
events in the (£, r)-plane. Hence, the joint probability 
density function (PDF) of recurrences is given by the 
same exponential, 



(9) 



except for the possible cut-offs at A and/or a, beyond 
which the density of recurrences is zero; p(£ > A, r) = 

p(£,T > cr) = 0. 



C. Infinite space and time domains 

For a detailed discussion of the spatial and temporal 
distributions of recurrences we deal separately with the 
cases of finite and infinite A and/or a. We first consider 
the case where neither nor Ht(t) is normalizable, 

i.e. A = = 00. This, for example, describes the case 
of stationary and homogeneous systems in infinite D- 
dimensional Euclidean space, where Ht(t) = const and 
oc l ^ 1 . But it holds also approximately for fractal 
distributions in space (if we neglect effects of lacunarity 
[3l|). with D being the fractal dimension. Notice that 
/ °° dll ^ 1 =00 for all values of D. 

The spatial and temporal density distributions of re- 
currences in canonical coordinates are obtained by inte- 
grating Eq. to obtain the marginals, 



(10) 



Pr(r) = / d£ P (£,T) = l/T. 



(11) 



Assuming that the system is translationally invariant in 
time and fractal in space, i.e. p>t(~t) = b = const and 
/!;(/) = aDl D ~ l , we obtain for the densities in the origi- 
nal coordinates 



Pi(l) = W(0P€(£(0) 



aDl 



D-l 



al D 



D/l , 



Pt{t) = Mt)Pr(T(t)) = bibt]- 1 = l/t 



(12) 
(13) 



Thus the recurrence density in time is independent of 
the event rate (per unit space-time region). Similarly, for 
an event distribution with given (fractal or Euclidean) 
non-trivial dimension, the recurrence density depends on 
the dimension but not on the parameter a. Also, notice 
that pt (t) is completely independent of the spatial event 
distribution ni{l), and pi(l) is independent of fit(t). 

For homogeneous and mono-fractal stationary spatial 
distributions both p t and pi are independent of the ref- 
erence point defining the recurrences. This is no longer 
true for multifractals, where pi(l) depends on the local 
(point-wise) dimension at the event which defines the re- 
currences. 

The analog of Eq. (fT3)) for discrete time is a classic 
result in the theory of record s [2ll I22I |23| . In contrast, 
Eq. ([12"]) was first reported in [29|], as far as we know. 



D. Finite space and infinite time — and vice versa 

Let us assume that p,t(t) is not normalizable but 



A < 00 , 



(14) 



Now, of course, p$(£) = for £ > A. For £ < A, on the 
other hand, p%(£) is still given by integrating exp(— £t) 
over all positive values of r as in Eq. (|10p. i.e. 



p«(£) = |0(A-£)- 

In terms of the original coordinates, one finds 

w(0 



Pl (0 = «(i)p*(£(0) 



(15) 



(16) 



In contrast, p T {r) is obtained by integrating Eq. ([9]) 
over the finite domain < £ < A, which gives 



p T (r) = -(l- e -n 



(17) 



In the stationary case, when t is just proportional to 
r, the density of recurrences in t is given by the same 
formula with A replaced by the rate of events per unit t. 
The additional term compared with Eqs. (TlTj) and (fT5]) 
reflects the probability that no recurrence occurs up to 
time r and t, respectively. 
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In the opposite case (A < oo, a = oo) of finite event 
rate per unit distance and infinite rate per unit time (cor- 
responding typically to infinite space and finite time, with 
finite space time density of events) , the situation is com- 
pletely symmetric. In that case p t {t) is cut-off sharply at 
a finite value, while p$(£) is cut-off with an exponential 
correction term as in Eq. (|17p . 



E. Finite space and finite time 

Now both p£(£) and p T (j) are obtained by integrating 
Eq. © over finite domains, 



pc(0 = ie(A-0(i-e-*), 



1 



p T (r) = -e(a-T)(l-e- AT ). 

T 



(18) 
(19) 



Thus, p^(C) asymptotically approaches the constant 
a in the limit £ — > while for intermediate arguments 
we recover the I /£ decay for infinite space and time do- 
mains given in Eq. (|10p . For large arguments, the density 
sharply drops to zero at £ = A. p T {j) asymptotically ap- 
proaches the constant A in the limit r — > while for 
intermediate arguments we recover the 1/r decay for in- 
finite space and time domains given in Eq. (jlip . For large 
arguments, the density sharply drops to zero at r = a. 
The respective transition points £* and r* between the 
constant behavior for small arguments and the decaying 
behavior for intermediate arguments can be defined in 
the standard way by requiring that the argument of the 
exponential equals —I, i.e., 



aC = 1, 
At* ee I. 



(20) 
(21) 



Specific realizations of such a process include station- 
ary systems observed over a finite time window, where 
events occur only in a finite region of space — or are 
only recorded when they fall into that region. One ex- 
ample is nt(t) = bO(T - t) and m = aDl D - 1 0(R - I) 
with positive constants T and R. In this case, Eq. I|18p 
translates into 



w(0 




for I < P(T), I < R, 
for I > l*(T), I < R, 
for I > R , 



(22) 



and Eq. (flU|) translates into 



Pt(t) 




for t < t*(L), t<T, 
for t > t*(L), t<T , 
for t > T , 



with 



and 



1*{T) ee {abT)- l ' D 
t*(L) ee [abR ]- 1 . 



(23) 

(24) 
(25) 



Finally, let (N) = abTR D be the average total number of 
observed events. Then the expressions for the transition 
points are particularly simple 



l*(N) 
t*(N) 



L/{N) l ' D , 
T/(N). 



(26) 
(27) 



In this simple example and in the situations discussed 
in subsection IIII Dl we have assumed that translations! 
invariancc holds. However, this is generally not true. 
Specific realizations of such processes include stationary 
systems observed over a fixed finite time window, where 
events occur only in a fixed finite region of space — or 
are only recorded when they fall into that region. Due to 
the lack of translational invariance, the distributions of 
distances (spatial and temporal) between events depend 
on the defining event. We discuss the consequences of 
broken translational invariance now. 

For concrctcncss and simplicity, let us assume a sta- 
tionary system where events occur uniformly on an in- 
terval < x < L with periodic boundary conditions, 
with space-time density a. They are recorded only in 
the time window < t < T . In general, the distributions 
of distances between events in a bounded space-time re- 
gion depend on the reference point (xa,to), but in the 
present case this simplifies due to the periodic bound- 
ary condition: The recurrence distributions depend on 
to, but not on xq. More precisely, 



Hi(l;xo,h) = 2aG(L/2-l), 
(H(t;x ,t ) = Q(T -t -t), 



(28) 



for positive arguments t and /, respectively. Note that the 
asymmetrical attribution of the factor a to /i/ is arbitrary. 

This ansatz gives a — (T — to) and A = aL. The 
relations between original and canonical coordinates are 



r(t) 



2al 
aL 



t 

T-t 



0<l<L/2 
l>L/2, 



< t < T - to 
t > T - t Q . 



(29) 



(30) 



The recurrence PDFs are obtained by inserting this 
into Eqs. (| 181 and transforming back to the original 
coordinates. The average distributions of distances be- 
tween recurrences and reference points are obtained by 
averaging over to- The final results are 



(Pi(0>to = y 



l - 



1 - e 



-2q/T 



2alT 



G(L/2-l), (31) 



(Pt(t))t 



(l- e -«")(l_-)0(T-i), (32) 



These detailed results are included in order to demon- 
strate that exact calculations are possible in the most 
simple case. But in more realistic cases no exact results 
can be expected. As a general rule, the simple power 
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laws of Eqs. (|12ll3p will hold for intermediate values of 
I and t, but corrections will be necessary both for large 
and for small / and t — as follows from Eqs. (|18|19() . The 
corrections render the distributions finite at small values 
of the arguments, and they cut them off at large ones. 

The cut-offs at large I and t occur just at the sizes of the 
system. Their detailed shapes depend, as suggested by 
comparing Eqs. (|31l32p with Eqs. (|22l23p . on the specific 
properties of the system at large scales. The behavior at 
small distances is more general. 

To see this, let us consider Eq. (f3"2"| in more detail. 
There the deviation from the infinite system limit hap- 
pens when atL ss 1, i.e. at a time 



t w t*(L) = (aL)~ 



(33) 



which exactly coincides with Eq. (|25p for the transla- 
tional invariant case. Since a is the density of events in 
space-time, t* is the average time delay between succes- 
sive events. Obviously, recurrences cannot follow each 
other faster than events. Similarly in Eq. (|3ip . the 
deviation from the infinite system limit happens when 
2alT w 1, which coincides with the expression for l*(T) 
in the translational invariant case given by Eq. (|24p . 

Not only is the scaling of I* and t* identical to the 
translationally invariant case but also the qualitative be- 
haviors of (pi(l))t and of (pt{t))t a for I -C I* and t<^t*, 
respectively, are identical. This strongly suggests that 
the results given in Eqs. (|22I23I24I25|) capture the essen- 
tial behavior for scales smaller than the large scale cut-off 
- even when translational invariance is explicitly broken. 



F. Correlations between recurrences and 
properties of recurrences with fixed rank 

Let p(l,t;l',t') be the PDF that two events at space- 
time positions (l,t) and (l',f) are both records - not 
necessarily subsequent ones. Referring to Fig. [2J and 
using Bayes' theorem in canonical coordinates, we are 
interested in the probability that no other event occurs 
in either of the two rectangles associated to the events, 
which is determined by the union of the two rectangular 
areas. Hence if t > r' and £ < then 

r; = e-^'-^'e(a - r)6(A - £') . (34) 

This directly determines p(l, t; I', t'). 

Integrating over £ and £' gives the joint PDF for having 
recurrences at times r and r', 



p(r, t') 



a /•£' 
d£ / d£ e~« r -(« 0(ff - r) 
ii Jo 



1 
tt' 



1 



Te -\r' _ r ' e -Ar 



6(cr - t) (35) 



For A = oo, this gives p(r, t') — {tt') l , for t' < r < a 
so the two recurrences are uncorrelated. For finite A, 



records are correlated; i.e. p(t,t') ^ p r (r)p r (r'). For 
a stationary process, these results hold in the original 
coordinate t as well. 

Alternatively, let q(l,t;l',t') be the probability that 
two events at (l,t) and {l',t') are successive records. As- 
suming again that t > r' and £ < we now demand 
that both are records, as above, and also that no other 
event happens in the rectangle [£',£] x [t',t], or 



^,r;£',r') = e-«' T e(A-£')0(^-T) 



(36) 



Integrating over £ and £' gives the joint PDF for having 
successive records at times r and r' to be 



A 

q(T,T>) = I d? [ rf£ e-« V 0(<7 - r) 
Jo Jo 



(37) 



= — [1 - (1 + Ar)e- Ar ] for r' < r < a . 

Hence, times to successive recurrences are always corre- 
lated. When A = oo, the joint PDF is q(r, r') = 1/r 2 for 
t 1 < t < a. 

For the PDF of the ratio of the times of successive 
records x = t'/t > 0, it directly follows for finite A that 



q T (x) 



dT'q{T{x),T') 



(h_ 

d.v 



6(1 



(38) 



which is constant in the interval [0; 1]. This is also the re- 
sult in the original coordinates, if the system is stationary 
- in which case t oc t and x = t'/t. 

We now discuss spatial distance distributions of recur- 
rences with fixed rank i, and first consider a finite sta- 
tionary system infinitely extended in time (<r = oo). Let 

p^(£) be the spatial distance PDF for the z-th recurrence 
following Event-0. For any i > 2, the recursion relation 

" de'«(elOP? _1) «'). (39) 



exists. The quantity is the conditional PDF, given 

that the previous recurrence happened at distance for 
the distance of the next recurrence. One easily shows 
that 



Q®0 = ^W-t) 



independently of i, so that 



P e (t) = J -TfPl U) 



The solution for finite A is 
P? } (0 



w^_e(A-o 
(i-iy. 
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(40) 



(41) 



(42) 
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If the event density in original coordinates was /k(Z) = 
aDl D ~ 1 Q(R — l)/R D , i.e., confined to a disc with radius 
R, then the last equation translates into 



P, w (0 = aQ(R-l) D ' j 



D-l 



(In R/lf 



(43) 



(t- l)!i? D 

while Eq. (gDJ) gives for the PDF of the ratio x = l/V > 

qi(x) =Dx D - 1 e(l- x) . (44) 

These last results have to be modified when a < oo, 
i.e. when there is a finite observation window in time. 
In that case we are not guaranteed that at least i recur- 
rences exist, and thus p\ l \l) has to be replaced by the 
conditional PDF, conditioned on the existence of > i re- 
currences. That requires a more extensive development 
than we take up here. 



G. Distribution of the number of recurrences — or 
the degree distributions 

The out-degree distribution P out (k,N) is the proba- 
bility that a randomly chosen event out of a sequence of 
N events has k records. This probability can be deduced 
using previous results from the theory of records [2l|, [22|, 
HI, [12, HI]. We assume that the system is stationary, 
with a finite rate of events per unit time. We denote 
the event defining recurrences as Event-0. We use the 
fact that recurrences are records in the sense that each 
recurrence is an event that is closer to Event-0 than all 
previous events that happened after Event-0. Consider 
a series of i events following Event-0. The probability 
that event j is a record is l/j and the probability that 
it is not is (j — Hence the probability that there 

is precisely one record in a series of i events following 
Event-0 is = JXj^U - l )H = l / L Notice that the 

first event after Event-0 is always a record. The proba- 
bility that there are precisely two records in the series of 
i events is 



^(2)=(n 



3 -1 

j 



h 1\ 1 1 



3=2 J l 1= 2 



y — *- ) = t Y— 



h=2 



(45) 

Continuing with standard methods it is possible to show 
that the probability of finding precisely k records in a 
series of i events, Pi(k), is given by 



Pi 



(*) = 7 E 



i 



KIi<--<It-iS' 

i\ 

(]ni) k - 1 
i(k-l)\ ' 



(46) 



where the symbol S indicates Stirling's number of the 
first kind and the last expression holds for i > i; > 1. 



Considering that each event except the last one in the 
sequence of N events initiates its own sequence of records, 
and hence is an Event-0, gives 



(ln(iV)) A 
N k\ 



(47) 



where the last step involves approximating the sum as 
an integral, which is valid for large TV. Therefore, the 
out-degree distribution for a random process of N 3> 1 
events is a Poisson distribution with mean degree (k) w 
In AT [H. 

Furthermore, the probability to have out-degree one, 
P out (l,N), can be computed exactly [UHl]: For those 
nodes the closest event in space is also the closest in time. 
For event i, this happens with probability 1/(AT — i). 
Thus, 



P out (l,N) = N 



N-l 
-1 \ ^ 



1 



^ N-i 

i=l 



ln(AQ + e M 
N 



(48) 



where we have approximated the harmonic series by 
the corresponding integral and eM ~ 0.58 is the Euler- 
Mascheroni constant. Note that Eq. ([48]) is exact in the 
limit N — * oo. 

For the in-degree distribution, P ln (k,N), similar con- 
siderations apply: Event i is a recurrence of event j 
(0 < j < i) with probability — j), which is inde- 
pendent of N. This allows to compute the in-degree dis- 
tribution P* n (k) of event i: 



Pt{k) 



i — k i — fe+1 

E ■ 

; 1= o i 2 =h+i 
\S*\ 



1 i — k 



1 



i-l 



(ln(z)) 



k-l 



i{k-l)\ 

for < k < i — 1 and zero otherwise. Hence 

N-l 

N 



N-l 

p-(fc,iv) = -E^ m (fc) 



(ln(iV)) fe 



N k\ 



(49) 



(50) 



As expected for a fully random process, P m (k, N) is iden- 
tical to P out (k,N) and well- approximated by a Poisson 
distribution with mean degree (k) w In AT for N 3> 1. 



H. Degree correlations 

Due to the acausal nature of the null model, the joint 
probability Pi(k m , k out ) that event i has in-degree k m 
and out-degree k out factors for all nodes i. As a result 



Pi(k m ,k out ) 



P N -i(k owt )P? 
(In (N -i)) k ° 



(fc m ) (51) 
(In (i))**"- 1 



(AT - i)(k out - 1)! i(k m - 1)! 
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This allows us to compute the mean out-degree of all 
events with a given in-degree in a sequence of N events 



The out-degree (k out ) weakly depends on k m due to the 
fact that the rank of each event implicitly couples its 
in- and out-degree in a finite sequence of events. For 
instance, if the rank of an event is small (large) com- 
pared to N, the in-degree is more likely to be small, but 
the out-degree is more likely to be large. Consequently, 
(k out )(k m , N) decreases with k m for fixed N. For sim- 
ilar reasons, weak correlations also appear between the 
in-/out-degree of a node and the in-/out-degree of its re- 
currences. For example, a large (small) in-degree for a 
node implies on average a small (large) out-degree for 
its recurrences. Similarly, the out-degree (in-degree) of 
recurrences increases on average with the out-degree (in- 
degree) of their Event-0. 

IV. APPLICATION TO SEISMIC PATTERNS 

Seismicity is a prime example where localized events 
in space and time can be accurately and, with certain 
caveats, exhaustively recorded. It is also a phenomenon 
where the causal features of the dynamics responsible for 
the patterns are subject to ongoing debate and uncer- 
tainty. Seismic data involving many earthquakes occur- 
ring over large regions of space and time exhibit a num- 
ber of regularities. These include clustering, fault traces 
and epicenter locations with fractal statistics, as well as 
scaling laws like the Omori and Gutenberg-Richter (GR) 
laws (see e.g. Refs. 00, f° r a review). Given that 
the associated earthquake patterns in space and time are 
readily observable, approaches based on the concept of 
spatiotemporal point processes have been amply demon- 
strated to be feasible 0, [H, [H, H3, [H, . In that case, 
the description of seismicity is reduced to recording the 
size or magnitude of each earthquake, its epicenter and 
its time of occurrence. 

To test the suitability of our method to character- 
ize seismicity in a way that makes it possible to infer 
relevant causal features of its dynamics and to extend 
our earlier analysis 0], we study a "relocated" earth- 
quake catalog from Southern California [24(. The cata- 
log has improved relative location accuracy within clus- 
ters of similar events, the estimated horizontal standard 
errors being typically less than 50 to 100m and the esti- 
mated vertical standard errors being typically less than 
100 to 200m 0. Due to the higher relative and 
absolute location errors for the depth of an earthquake, 
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FIG. 3: (Color online) Spatial pattern of seismicity in South- 
ern California [24|], as described in the text. 



we only consider epicenters in the following. The cat- 
alog is assumed to be homogeneous from January 1984 
to December 2002 and complete for events with mag- 
nitude larger than m c — 2.5 located within the rect- 
angle (120.5°W, 115.0°W) x (32.5°iV, 36.0°iV) 0. Re- 
stricting ourselves to magnitudes larger than m c gives 
N = 22217 events (see Fig. [3]). In order to test for robust- 
ness and the dependence on magnitude, we analyze this 
sub-catalog and subsets of it, obtained in two different 
ways: By (a) selecting different threshold magnitudes, 
namely m = 3.0,3.5,4.0 giving N = 5857, 1770 and 577 
events, respectively, or (b) using a shorter period from 
January 1984 to December 1987 giving N = 4744 events 
for magnitude threshold m = m c . 

It is important to note that all events in the catalog are 
treated in the same way. In particular, we do not distin- 
guish between foreshocks, mainshocks and aftershocks. 
Hence, our definition of a recurrence - an event is a re- 
currence of any previous event if it is closer to it in space 
than all the intervening events - is a priori independent 
of those classifications. Note also that our definition of a 
recurrence is wholly unrelated to the notion of "charac- 
teristic earthquakes" on a single fault as introduced, for 
example, in Refs. 0,00. 

Fig. 3] shows the recurrences with magnitude m > m c 
defined by our method for one randomly chosen event 
in the catalog, an earthquake of magnitude 2.9 that oc- 



I 

(k out )(k in ,N)= V r t -yp,(p,r i )/p i "(r,]v)«i + — — — / s^lui — 

\ A ' > iV 4-^ (In (N — l)) fe y x N-l-i 




latitude (°) 
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FIG. 4: (Color online) Map showing a 2.9 earthquake (black) 
and its recurrences as defined by our method. The size of the 
symbols linearly scales with the magnitude of the event and 
its color corresponds to the time of occurrence (from darker 
colors to lighter colors). See Table [T] for more details. 



TABLE I: List of recurrences of the 2.9 earthquake given in 
Fig.|4]as defined by our method for threshold magnitude m = 
2.5. 



rank 


magnitude 


I (km) 


T(h) 


1 


2.5 


234.36 


22.16 


2 


2.5 


87.39 


42.81 


3 


2.7 


84.98 


198.87 


4 


2.5 


84.34 


232.94 


5 


2.6 


83.97 


236.56 


6 


3.0 


73.99 


296.51 


7 


2.8 


72.80 


424.95 


8 


3.3 


26.37 


961.64 


9 


2.5 


13.38 


3471.73 


10 


2.9 


6.99 


3482.97 


11 


2.6 


5.31 


25452.30 



curred on January 10, 1999. The actual spatial and tem- 
poral distance between this event and each of its recur- 
rences is listed in Table HI It has to be noted that the 
number of recurrences of a given earthquake or Event-0 
is generally not related to its magnitude. The number 
of recurrences of the largest earthquakes like the Lan- 
ders event or the Hector mine event are just above the 
average (see Section riVB[) . Thus, most recurrences are 
associated to Event-Os with small magnitude — which 
are much more abundant according to the Gutenberg- 
Richter law. 



A. Spatial distances of recurrences 

Fig- HI shows the estimated PDF p m (l) of recurrences 
at a spatial distance I in the sub-catalog with threshold 
magnitude m. The PDFs exhibit a peak at a typical 
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FIG. 5: (Color online) Distribution of distances I of recur- 
rent events for sets with different magnitude thresholds m. 
The distribution for m = 2.5 up to 1988 is also shown and 
is almost indistinguishable from the data for the full catalog 
with m — 2.5 - showing the invariance of the distribution 
with respect to the time span of the recorded events. Filled 
symbols correspond to distances below 100 m and are unreli- 
able due to location errors. The inset shows a data collapse, 
obtained by rescaling distances and distributions according 
to Eq. 1|53[1 (excluding unreliable points). The full straight 
line has slope 2.05; the vertical dashed line indicates the pre- 
factor Lo in the scaling law for the characteristic distance, 
P(m) = L x l0°- 45m . Note that p m (l)dl = 1. 



distance, l*(m), which increases with magnitude. For 
sufficiently large I, all distributions show a power law 
decay with an exponent ~ 1.05 up to a cutoff. This 
cutoff corresponds to the size of the region in Southern 
California that we consider, and hence is a finite size 
effect. For small distances I < l*(m), we observe an 
approximately linear increase. 

With a suitable scaling ansatz, the different curves in 
Fig. [5] fall onto a universal curve, except at the finite size 
cutoff. The inset in Fig. Oshows results of a data collapse 
using 



p m (i) ^r lm F(l/10 OA5m ) . 



(53) 



The scaling function F has two regimes, a power-law in- 
crease with exponent « 2.05 for small arguments and a 
constant regime at large arguments. The transition point 
between the two regimes can be estimated by extrapo- 
lating them and selecting the intersection point, giving 
Lq = 0.012km. For the characteristic distance that ap- 
pears in F we find 



0.45m 



P(m) ps L x 10' 



Discovery of Causal Structure 



(54) 



Although p m (l) has the same overall shape as the dis- 
tribution p(l) of the finite null model (see Eq. (|2"2")) L there 
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are fundamental differences with respect to the depen- 
dence on the time span over which events are recorded. 
For the earthquake data, p m (l) and in particular l*(m) 
do not depend on the time span at all but rather depend 
directly on to. This conclusion comes from the explicit 
comparison of two different observation periods in Fig. [5] 
with the same m. With the exception of the smallest val- 
ues of I, p 2 ' 5 (l) is largely unaltered if only the sub-catalog 
up to 1988 is analyzed and /* does not change at all. It 
is important to note that the total number of events in 
the latter sub-catalog is roughly 5 times smaller. 

In the null model I* depends explicitly on the finite 
time span of the observation period, T, as shown in 
Eq. (j2"4")) . In the real data though, the spatiotemporal 
ordering of earthquakes determines the value of I*, re- 
gardless of the duration of the observation period - as 
long as it is large enough to obtain sufficient statistics 
to determine l*(m) and small enough that seismic cor- 
relations do not disappear over that time span. This is 
confirmed by analyses of other sub-catalogs (not shown) . 
On this basis, we conclude that the characteristic length 
must therefore reflect robust physical properties of the 
underlying dynamics over the given observation periods. 
Its (quasi)-invariance is not a property of the null model. 
Therefore, it reflects causal structure in the dynamics 
of seismicity. As a result, if one re-arranges the seis- 
mic catalog by "shuffling" the locations and magnitudes 
of events (see Section TlVB lj) . then the invariance of I* 
is lost and the distribution of recurrences behaves the 
same as the null model for spatial dimension D — 2 (see 
Eqs. (|22l24p ). To sum up: the invariance of l*(m) is an 
indicator of causality and is thereby a physically mean- 
ingful length scale in the dynamics of seismicity over the 
time scales we can explore with statistical methods - min- 
utes to decades. 



2. Identification with the Rupture Length 

The almost complete lack of dependence of p m {l) (ex- 
cluding very small values of I) on the considered time 
span can be explained by at least two scenarios: 1) Re- 
currences with I <C l*(m) are greatly suppressed at large 
time scales; 2) Recurrences with I « l*(m) are greatly en- 
hanced at short time scales compared to the null model 
with constant rate. As we will discuss below, it is likely 
that both effects are present. 

Physically, such a behavior is reasonable if we iden- 
tify I* with the rupture length of the earthquake that 
starts a chain of recurrences. As described by Omori's 
law [46| , the rate of seismic activity tends to increase di- 
rectly after an earthquake nearby (close to the rupture 
area of the event). Moreover, there is some evidence that 
due to the stress relief within the rupture area itself, it 
tends to exhibit less seismic activity for awhile — see, 
for example, Ref. This supports the hypothesis that 
activity increases for I as l*(m) at shorter times, but gets 
suppressed for I <C l*(m) over longer times. 



This identification is also affirmed by the fact that the 
scaling of l*(m) with m is close to the estimated be- 
havior of the rupture length Ln(m') w 0.02 x 10 m / 2 km 
given in Ref. [48j and remarkably close to Lr(to') = 
~ 0.018 x 10°' 46m 'km given in Ref. [U, where 
to' is the magnitude of the earthquake and An its rup- 
ture area. The close agreement between the latter and 
Eq. (|54p suggests that the characteristic length scale of 
distances for recurrent events is indeed the rupture length 
of events with to' = to, defined in terms of the rupture 
area I* = Lr = V 'An- Thus, our approach allows us to 
discover the rupture length as a causal consequence of 
the dynamics based purely on the spatiotemporal orga- 
nization of seismicity without any additional knowledge 
of the microscopic dynamics and the actual rupture pro- 
cesses that occur - even, in fact, treating the seismic 
events as point-like in space and time! 

The identification I* = Lr is also consistent with the 
fact that the description of earthquakes as a point process 
breaks down below the rupture length. Then, the rele- 
vant distance(s) between earthquakes is not determined 
solely by their epicenter positions but also by the relative 
orientation and size of the extended ruptures in 3D space. 
Thus, we expect to find a different correlation structure 
for distances smaller than the rupture length. In fact, 
this is precisely what our data show, namely a linear in- 
crease at small distances, I -C l*(m) (see the main part 
of Fig. [5] and also the straight line with a slope of 2.05 in 
the inset of Fig. |SJ| . 

3. Robustness ofl*(m) 

The lengths I* observed for the values of to we consider 
are larger than the location errors (« 100m). Simula- 
tions show that p A {l) (blue triangles in Fig. HJ) does not 
change substantially if the epicenters in the catalog are 
randomly relocated by a small distance up to one kilome- 
ter. Yet, the maximum for p 2 - 5 (l) shifts to larger I with 
this procedure, destroying the scaling of l*(m). Since the 
smallest I* that obeys the data collapse is « 160m, the 
data collapse we observe for the original data verifies that 
the relative location errors are indeed less than 100 m, 
or of that order [5(j. Furthermore, the absence of any 
anomaly due to location errors near 100m in Fig. [5] indi- 
cates that recurrences within the rupture area lack corre- 
lations. This is also supported by Eq. (f2"2")) which predicts 
the observed behavior p m (l) cx I for I < I* if events are 
happening uniformly and randomly in 2D space during 
a finite observation period, or are recorded as happening 
randomly in space due to location errors. 

4- Spatial Hierarchy of Subsequent Records 

To further examine the behavior of p m (Z), we study 
separately the contributions of recurrences with definite 
rank. The rank i is defined as in Sec. Ill F, i.e., for a 
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given Event-O, recurrence i + 1 directly follows recurrence 
i as shown in Fig. [5] Since p m (l) is the PDF that any re- 
currence occurs at distance I for a catalog with threshold 
magnitude m, we have for any finite number of events N, 



P m (l) 



N-l 

£ 

1=1 



p rec (i)p?(l) 



(55) 



where p rec (i) is the probability that a randomly chosen 
recurrence is an i'th recurrence, Ni is the number of 
events in the sequence that have at least i recurrences (or 
out-going links), and p" l (l) is the conditional PDF that, 
given that a recurrence is an i'th recurrence, it happens 
at distance I. 
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FIG. 6: (Color online) Distribution of distances I of the first 
recurrence for different magnitude thresholds m. Filled sym- 
bols correspond to distances below 100 m and are unreliable 
due to location errors. Note that J™ p?(l)dl = 1. Inset: 
Data collapse obtained by rescaling distances and distribu- 
tions according to Eq. (|56|) (excluding unreliable points). 



possible distance in the region covered by the catalog. 
By construction these ratios are always between zero and 
one. We denote by q™(x) the PDF that k+i/k = x for 
each event that has an (i + l) th recurrence. As indicated 
in Fig. [3 the data for m = 2.5 and i — (black circles) 
scale over a wide region as q^ 5 (x) ~ x~ Sr with S r fts 0.6. 
This is expected since q™(x) ~ P™(0- Although each 
distribution qf (x) is different, the curves for i > 1 also 
show (more restricted) power law decay comparable to 
<7o' 5 (ir). For k + i/k — > 1 they also exhibit a peak that be- 
comes more pronounced with increasing i. This is due to 
recurrences occurring at almost the same distance (but 
not at the same place!) suggesting again that recurrences 
are suppressed within the rupture area, but are enhanced 
just outside that area. 
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FIG. 7: (Color online) Distribution of recurrence distance ra- 
tios U+i/li for m = 2.5 and different values of i with lo = 634.3 
km. The straight line corresponds to a decay with exponent 
0.6. Note that f£° q?(x)dx = 1. 



In the inset of Fig. [5J the data are analyzed according 
to the ansatz that the distribution of first recurrences, 
Pi l (l), has the scaling form 

v f(l) =r s *F(!/l(P Mm ), (56) 

with 5 r w 0.6 and F similar to F (see Eq. ([53)) and the 
inset of Fig. [5] for comparison). In particular, the same 
characteristic distance l*(m) appears as for p m (l). More- 
over, we find that the latter is true for all p^il) — which 
is further evidence supporting the interpretation of /* as 
the rupture length. The behavior of ?>5™(0 indicated in 
Fig. [5] and described by Eq. (f5rj|) extends earlier results 
for a catalog from Southern California with lower spatial 
resolution (w lkm)which did not allow to resolve the 
dependence on m [35j . 

Related to the distribution of distances for recurrent 
events is the distribution of distance ratios k+i/h of con- 
secutive recurrences. Here again recurrences are ordered 
such that recurrence i + 1 directly follows recurrence i. 
For i = 0, we take Iq = 634.3 km, which is the largest 



The observed behavior of qf' 5 (x) and Pi™(0 is very dif- 
ferent from the behavior predicted by the null model. 
For the null model, in the long time limit, Eq. (144|) 
gives q™(x) = Dx " 1 which is not only independent 
of i but also purely determined by the spatial dimension 
D - and is increasing for D > 1 rather than decreas- 
ing. Similarly, Eq. gives p™(0 x for tnc nuu 
model. For Southern California, it has been found that 
D = D-2 = 1.2 [3^,[3^], which would lead to an increas- 
ing function q™{x) ~ a; 2 rather than a decaying power 
law behavior. Although the above predictions of the null 
model are only strictly true in the infinite time limit, we 
point out that repeating this analysis of the hierarchy of 
recurrences for a "shuffled" catalog reveals behavior in 
close agreement with the null model and diametrically 
opposed to the results shown in Fig. [7] for the actual seis- 
mic record (3r| . 

Thus, the observed behavior of qf' 5 (x) and Px"{l) as 
well as the value of 5 r are not determined by the spatial 
distribution of seismicity alone but reflect causal struc- 
tures leading to the complex spatiotemporal organization 



13 



of seismicity. Moreover, the shape of p" l (l) shows that the 
first recurrence is much more likely to happen at a typical 
distance of I* than predicted by the null model. This en- 
hancement goes along with a suppression of recurrences 
with I <C I* as the increasing (with i) peak at x = 1 for 
q™(x) indicates. These results support the overall picture 
that recurrences with I <gC l*(m) are greatly suppressed 
at large time scales while recurrences with I ps l*(m) are 
greatly enhanced at short time scales. 



B. Network properties 




FIG. 8: (Color online) In- and out-degree histograms for dif- 
ferent values of m. For a given earthquake, the in-degree 
(out-degree) k is the number of links directed at it (originat- 
ing from it) as defined in Section [II] Open (red) symbols cor- 
respond to the in-degree, filled (black) symbols correspond to 
the out-degree. Error bars can be estimated as y/N(k). The 
red lines correspond to Poisson distributions with the same 
respective mean and normalization. 

We now turn to the analysis of seismicity in terms of 
the statistical properties of its network of recurrences 
(or records) as defined in Section [TT] and illustrated in 
Fig. [T] Fig. [8] shows the in- and out-degree histograms 
for different values of to, which are compared to Pois- 
son distributions with the same respective mean de- 
gree and normalization ((k) — 7.40,6.24,5.20,4.35 for 
rn = 2. 5, 3.0, 3. 5, 4.0, respectively). A Poisson out-degree 
and in-degree distribution is expected for the null model 
(see Eq. (|4"T)) and Eq. ([50)) ). For the actual seismic net- 
work, the out-degree distributions are significantly dif- 
ferent from a Poissonian [5lfl . In particular, the network 
keeps a preponderance of nodes with small out-degree as 
well as an excess of nodes with large out-degree compared 
to a Poisson distribution. This effect becomes more pro- 
nounced with increasing magnitude. 

The behavior of the out-degree distribution implies 
that the network topology is able to discern consequences 
of the causal structure of seismicity: The preponderance 



of nodes with small out-degree, for example, can be re- 
lated to the physical picture discussed above that seismic 
activity is typically greatly enhanced directly after the 
occurrence of an earthquake close to its rupture area but 
suppressed within the rupture area itself. Such a dynam- 
ics makes it more likely that only very few recurrences 
occur, even at long times. For the in-degree distributions, 
we find that they roughly agree with a Poisson distribu- 
tion although there are still significant deviations from 
the null model for k = 1 [52|. 

Note, however, that (k) — which is obviously the same 
for the in- and out-degrees — decreases with to, simply 
because the number of events N shrinks with to. This is 
shown in Fig. [H] where (k) is also displayed for a randomly 
shuffled catalog. 



1. Shuffling Procedure 

Shuffling was performed in the following way: Consider 
all events in the catalog with magnitude ml > m c = 2.5. 
Shuffle the magnitudes and the epicenter locations sepa- 
rately, keeping the times of occurrence, and then apply 
the recurrence analysis for the different subsets defined 
by different magnitude thresholds as before. The shuf- 
fled catalog can, thus, be considered as a realization of 
a random process with no spatiotemporal correlations, 
although both spatial correlations and temporal corre- 
lations may persist separately. Based on the null model 
and Eq. (f47|) , we expect a Poisson out-degree distribution 
with (k) ss In N which is exactly what we find for the ran- 
domly shuffled catalog. This dependence of (k) can be 
clearly seen in Fig. [5] Yet, for the original earthquake 
data we find for large N 



(k) re 0.8 In TV. 



(57) 



Hence, the average number of recurrences is significantly 
less than for the null model, which is presumably related 
to the suppression of recurrences with I < l*(m) - as 
discussed earlier. Fig. [5] gives further evidence that re- 
currences emphasize particular aspects of spatiotemporal 
clustering, associated with the causal dynamics of seis- 
micity. 



2. Degree-degree Correlations 

The causal structure of seismicity does not, however, 
induce strong degree-degree correlations between events 
and their recurrences other than those arising from the 
temporal order of a finite sequence of events - as in the 
acausal null model. Panels A to C in Fig. [TUJ show the 
average out-degree and in-degree of recurrences as a func- 
tion of the in-degree or out-degree of their corresponding 
Event-0 (53|. There are no qualitative differences be- 
tween the actual earthquake catalog from California and 
a surrogate, which is a randomly shuffled version of the 
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FIG. 9: (Color online) Mean degree (k) as a function of 
number of events N (or magnitude m). Open symbols cor- 
respond to the original catalog for different values of m, 
while filled symbols correspond to the shuffled catalog (see 
text). The lines correspond to best fits giving (k) orig i na i — 
-1.03 + 0.84 IniVand {k) shuff i ed = -0.47 + 1.01 In AT. 



catalog. In particular, the behavior shown in panel A 
and B agrees with the acausal null model (see discussion 
following Eq. (|52j0. Note that the offset between the two 
data sets is simply due to different (k). 

The situation is different for the dependence of the 
mean out-degree on the in-degree of the same node. As 
shown in Eq. (fB"2")) , (k out ) has a weak dependence on k m in 
the null model such that (k out ) decreases with k m . This 
is exactly what we find for the shuffled catalog as shown 
in panel D of Fig. [TU] However, the same panel also 
shows that for the actual earthquake catalog (k out ) in- 
creases with k m - exactly the opposite of the null model. 
Moreover, k m < (k) implies k out < (k) on average. This 
is again consistent with a causal dynamics where earth- 
quakes are clustered in space and time. 



3. Clustering coefficient 

Other network properties include various measures of 
clustering. In general terms, clustering quantifies how 
well connected the neighbors of a node are among them- 
selves. In the case of recurrences, it refers to the likeli- 
hood that recurrences of the same event are also recur- 
rences of each other. There are different, inequivalent 
definitions of the clustering coefficient C [54]. Here we 
focus on the definition based on the local clustering co- 
efficient d adapted to directed networks. 

For all nodes i with out-degree larger than one, the 
clustering coefficient Cj is given by the ratio of existing 
links Ei between its k° ut recurrences to a possible num- 
ber of such links, \k° ut [k° ut — 1). Then the clustering 
coefficient C of the network is defined as the average over 
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FIG. 10: (Color online) Panels A - C: degree correlations be- 
tween the Event-0 and its recurrences. The average in-degree 
(fc^ra) or out-degree ) of the recurrences of all nodes with 
a given in-degree k m or out-degree k out for m = 2.5 is shown. 
Open (black) circles correspond to the original earthquake 
catalog from California, filled (blue) diamonds correspond to 
the shuffled catalog. The mean degree is indicated by the 
(red) solid line and the (orange) dashed line, respectively. 
Panel D shows the average out-degree of a node as a function 
of its in-degree. Open (green) circles correspond to the origi- 
nal earthquake catalog from California, filled (blue) diamonds 
correspond to the shuffled catalog. The black dash-dotted line 
is the approximation for the null model given in Eq. (152[) . In 
this panel, the behavior of the original earthquake data is 
qualitatively very different from the null model and the shuf- 
fled catalog. 



all nodes i with out-degree larger than one 



C = (Q) 



2Et 



k° ut (k° 



1 



(58) 



This definition implies, for example, that the cluster- 
ing coefficient of an Erdos-Renyi graph is equal to the 
the probability of linking each pair of nodes, pu n k = 
{k)/(N-l) = C rand . 

For the data from California, we obtain C = 0.2647 for 
to = 2.5. This is significantly larger than C = 0.1825, 
which is the value for the shuffled catalog. It has to be 
pointed out, though, that the average is performed over a 
different number of nodes in the two cases since the shuf- 
fled catalog hardly contains any events with out-degree 
equal to one. For the shuffled catalog, there are only 15 
events with k out = 1, which is close to the expected value 
of 10.6 for the random model - see Eq. (|48[) . This value is 
two orders of magnitude less than for the actual seismic 
data. 

Another difference between the two data sets is the 
distribution of Cj. For the actual earthquake data, the 
distribution is much broader. The standard deviation 
for the distribution is 0.2146 compared to 0.0934 for the 
shuffled catalog. This difference is mainly due to the fact 
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that the original data contain many events with C, = 
or C{ = 1 - unlike the shuffled catalog. 



C. Temporal distances of recurrences 




t (hour) 



FIG. 11: (Color online) Distributions of the waiting times 
between Event-Os and their recurrences for the original cat- 
alog and different threshold magnitudes m. The distribu- 
tion for m = 2.5 up to 1988 is also shown. Filled symbols 
correspond to times below 90 seconds which are underesti- 
mated and unreliable due to measurement restrictions: The 
finite rupture times of earthquakes and the associated seismic 
coda, which consists of a superposition of incoherent scattered 
waves, place limitations on the identification and separation 
of earthquakes. The inset shows the rescaled distributions. 
Note that f™p m {t)dt = 1. 

The temporal distances between events and their re- 
currences can be analyzed in the same way as the spatial 
distances. The PDF p m (t) for these waiting (or "inter- 
occurrence") times for different threshold magnitudes m 
is shown in Fig.[TlJ These all decay roughly as l/t a with 
a w 0.9 for intermediate times as indicated in the in- 
set. The apparent scaling region in Fig. [IT] shows some 
curvature, though. Due to the finite duration of the cat- 
alog, there is an observational cut-off at the longest time 
scales. At the shortest time scales, p m (t) goes over to 
a constant limit. While the shape of the distribution is 
roughly similar to the null model (see Eq. (|2"5)) ). p m (t) for 
the earthquake catalog is independent of m and, hence, 
the number of events in the catalog. This invariance is 
(again) drastically at odds with the null model where the 
temporal rate A = abR D determines the transition point 
and A itself depends on the number of events N as shown 
in Eqs. ([251271) . 

As described in what follows, the analysis for the 
shuffled catalog shown in Fig. [T^J is consistent with the 
acausal null model. As predicted by the null model, the 
distributions for the shuffled catalog must be rescaled by 
the rate of events in order to obtain a data collapse. Fur- 
thermore, the invariant behavior (with respect to mag- 



nitude m) we observe for recurrences in the original cat- 
alog differs substantially from earlier results for wait- 
ing time distributions between subsequent earthquakes 
[3a . l37l l38l [391 ] . It reflects a new non-trivial feature of 
the spatiotemporal dynamics of seismicity that appears 
when events other than the immediately subsequent ones 
- used to conventionally define waiting times - are con- 
sidered. 




10" 6 10" 4 10" 2 10° l(f I0 4 



in 1 " l™ LLLUini. L_l_LUm. L^jji™ 1 

1U _4 .2 2 4 fi 

10 10 10 10 10 10 

t (hour) 

FIG. 12: (Color online) Distributions of the waiting times be- 
tween Event-Os and their recurrences in the shuffled catalog 
(see text) for different threshold magnitudes m. Filled sym- 
bols correspond to times below 90 seconds which are under- 
estimated and unreliable. The inset shows the distributions 
rescaled by the respective rate of events A m . The solid line 
corresponds to a best fit assuming the functional form given 
in Eq. (|23|l . The dashed line highlights the transition point 
between the constant behavior and the 1/t decay. Note that 
J™p m (t)dt=l. 

For the shuffled catalog, p m (t) closely follows the the- 
oretical prediction of Eq. ([25)) and in particular the de- 
pendence on m - or rather on N through A. As shown 
in the inset of Fig. [T31 the different distributions — with 
the obvious exception of the observational cut-off at the 
largest time scales — collapse onto a single curve if t is 
rescaled by the respective rate A m . Here, A m is the mean 
rate of earthquakes above magnitude threshold m for the 
observation period. Notably, the main deviation from the 
stationary null model is that the location of the transi- 
tion point for the shuffled catalog is not at A m t — 1 but 
rather at A m t = 0.02. This is expected and due to the 
fact that the rate of seismic activity — which is preserved 
in the shuffled catalog — is not constant over time but 
exhibits large, correlated fluctuations as indicated, for 
example, by Omori's law [4rj| . 

The relative times between subsequent recurrences in 
the hierarchy can be analyzed in the same way as dis- 
tances were in Sec. IIV A 41 Fig. Q2] shows the PDFs for 
the ratios ti/ti+i for subsequent recurrences, i.e., recur- 
rences are ordered such that recurrence i + 1 directly 
follows recurrence i. For the cases shown, two power-law 
regimes seem to exist: For arguments smaller than about 
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FIG. 13: (Color online) Distribution of waiting time ra- 
tios ti/ti+i for m = 2.5. The straight line has slope -0.62. 
Open symbols correspond to the original earthquake catalog, 
filled symbols to the shuffled catalog (see text). Note that 
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Qi' {U/ti+i) decays with an exponent 5t 
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roughly independent of i, for larger arguments the decay 
is slower and the exponent apparently decreases further 
with i. Clearly, the broadest scaling regime materializes 
for ti/t2- 

The behavior of qf 5 {t 1 /t 2 ) for 10~ 3 < t 1 /t 2 < 1 could 
be compared to Eq. (f55|) , although the latter was derived 
for the translational invariant case. Equally important, 
Eq. (|38p only holds for the stationary null model. As dis- 
cussed above, seismic activity is not constant over time 
but exhibits large fluctuations. Fig. [T51 shows that these 
fluctuations as well as the loss of translational invariance 
are responsible for the behavior for arguments larger than 
about 10 -3 , since there is no observed difference between 
the original and the shuffled catalog. Yet, the deviations 
between the original data and the shuffled catalog for 
smaller arguments indicate that those short time differ- 
ences arise from the causal spatiotemporal organization 
of seismicity. 



D. Discussion 

It is important to discuss our results for the network of 
recurrences (or records) in view of what is known about 
causal connections between earthquakes. One specific 
type of causal connection is earthquake triggering. The 
increased seismic activity following large earthquakes — 
as described by the Omori law [46| leading to the iden- 
tification of aftershocks — is the most obvious exam- 
ple of earthquakes being triggered in part by preced- 
ing events. Aftershock sequences of small earthquakes 
are less obvious because the aftershock productivity is 
weaker, but can be observed after stacking many se- 
quences [55|] . Other approaches [16, 25] have generalized 



the definition of an aftershock so that an event can be an 
aftershock of more than one event leading to networks 
of earthquakes and aftershocks. Earthquake triggering 
is typically associated with stress changes which can be 
static stress changes imparted by the preceding shock 
or dynamic stress changes associated with seismic wave 
propagation or combinations of them as discussed, for 
example, in Refs. [56j, 15JJ, ]58|, |5£| |6fJ. The proposed 
physical mechanisms to explain earthquake triggering 
due to static stress change induced by a p rior event in- 
clude rate-and-state dependent friction [61| , crack growth 
[62l [H, HI] , viscous relaxation [65[ , static fati gue [66j , 
pore fluid flow (6?]], and simple sandpile models [68l] . 

Calculations of stress changes have been used to pre- 
dict the locations, focal mechanisms and times of future 
earthquakes (see Refs. [H, H3,[6!| for reviews). The suc- 
cess of this method is limited. Only about 60% of af- 
tershocks are located where the stress increased after a 
main shoc k ITOH : stress shadows are seldom or never ob- 
served [7J |72|; and the correlation of stress change with 
aftershocks is rather sensitive to the assumed slip dis- 
tribution [73[ . All of this could be due to the fact that 
most studies have neglected the influence of small earth- 
quakes and secondary aftershocks which can play an im- 
portant role [H^, [73] ■ Moreover, most studies have also 
neglected the influence of dynamic stresses radiated by 
seismic waves from (small or medium-sized) earthquakes 
which may also play an important role — even in the near 
field (see, e.g., Refs. [z! M, IzE Izl S HI)- In particu- 
lar, dynamic stress changes can dominate the triggering 
mechanism over a wide range of distances between 0.2 
and 50 kilometers from the fault rupture 81 1 . 

While it is not entirely clear how our results for the 
network of recurrences could allow one to distinguish be- 
tween the different types of stress changes associated with 
earthquake triggering, there are a number of currently 
unexplained observations that could be related to a par- 
ticular triggering mechanism. The excess of events with a 
large number of recurrences compared to the null model 
(see Fig. [5]) is one of them. Other examples include the 
correlations between the in-degree and the out-degree of 
a given event (see Fig. [TO] D) and the apparent invari- 
ance of the waiting time distribution with respect to the 
threshold magnitude (see Fig. [TT]) . The sensitivity of 
these properties as well as our other findings (especially 
the invariance of l*[m) with respect to the time span) to 
the triggering mechanism can be tested within the frame- 
work of the "epidemic type aftershock sequence" model 
which has been established as an improved stochastic null 
model for seismicity [82|, ]83| . It allows one to vary the 
spatial scaling of the triggered events depending on the 
assumed underlying triggering mechanism, namely static 
stress changes or dynamic stress changes [U Hj] . This 
will be the topic of a future publication. 

Finally, we would like to point out that simple and 
direct comparisons of our results for the network of re- 
currences (or records) with known results for aftershocks 
are not justified. This is due to the fact that recurrences 
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as defined by our method are at best a very small and 
non-random subset of what typically would be considered 
the set of aftershocks. Also, the power-law decay of the 
distribution of distances as shown in Fig. [5] occurs gener- 
ically for a wide class of processes due to the properties 
of records as discussed in Section Mil — independent of 
the specific properties of aftershock sequences described, 
for example, in Ref. [8l| . Similarly, the power- law decay 
of the distribution of waiting times (see Fig. [TT]) is also 
a generic property of records as discussed in Section Hill 
and is, thus, not related to the specific characteristics of 
aftershock sequences discussed in Refs. [H, Hf|- 

V. SUMMARY 

This paper provides a method to detect features in a 
temporal sequence of observations that can be plausibly 
attributed to causal dynamics even when the observer 
has no a priori knowledge of the underlying dynamics. 
Our starting point is to generalize the concept of a recur- 
rence for a point process in time to recurrent events in 
space and time. An event is defined to be a recurrence of 
any previous event if it is closer to it in space than all the 
intervening events; i.e. if it constitutes a record breaking 
event. Hence, the causal structure of events may be de- 
scribed as a network of events linked to their recurrences. 
Each event can have many previous events pointing to it 
(its potential causes) and many future events (its effects) . 
Causality can be plausibly inferred when the statistical 
properties of the network constructed using this method 
and the statistics of the records deviate strongly from 
those resulting from almost any acausal process. 

We derive analytically many properties for the net- 
work of recurrent events composed by random processes 
in space and time. In doing so, we develop a fully sym- 
metric theory of records where both the variable in which 
records occur and time, itself, are continuous. This sim- 
plifies the theory and in our view makes it more elegant. 
We discover a number of new analytic results for record 
breaking statistics. 

Many of those results are compared to properties of 
the network synthesized from time series of epicenter lo- 
cations for earthquakes in Southern California. Signifi- 
cant disparities that can be attributed to causality are 



mainly coming from the invariance of network statistics 
with the time span of the events considered. This is pre- 
sumably related to an observed hierarchy in the distances 
and times of subsequent recurrences. As a result a fun- 
damental length scale for recurrences is obtained solely 
from the earthquake epicenter data, which can be identi- 
fied as the rupture length. All these significant deviations 
disappear when the analysis is repeated for a surrogate 
in which the original magnitudes and locations of earth- 
quake epicenters are randomly "shuffled" . Almost all of 
the latter results are completely consistent with predic- 
tions from the acausal null model. Taken together these 
results suggest that causality in seismic dynamics may 
be much broader than any normative interpretation of 
"triggering" . 

Our results are generally robust with respect to modi- 
fications of the rules used to construct the network, e.g., 
using spatial neighborhoods such that the construction 
becomes symmetric under time reversal or taking into 
account magnitudes. All such modifications have the 
drawback that they do not define a record breaking pro- 
cess consisting of recurrences to each event. For seis- 
micity, our results are also unaltered if we exclude un- 
physical links with propagation velocities larger than the 
velocity of a "P wave" of about 6km/ sec (rs 0.1% of all 
links). This is also true if we restrict ourselves to veloc- 
ities smaller than the velocity of a shear wave of about 
3.5fcm/ sec which is often thought to be more relevant. 

By building certain specific features of causality into 
null models, it is possible to refine predictions and ex- 
amine what features in the network of seismicity are due 
to those aspects of causality and what are yet to be ex- 
plained. It remains to be seen how general our method 
may turn out to be. In principle it can be applied to any 
high resolution data set where events occur in space and 
time. Immediate applications may include analyses of 
other geophysical or astrophysical data sets, brain scans 
[87| . or analyses of models to validate or falsify them. 
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